feat: add trace stats computation by duncanpharvey · Pull Request #51 · DataDog/serverless-components

duncanpharvey · 2025-12-03T21:21:12Z

What does this PR do?

Adds agent-side trace stats computation in the Serverless Compatibility Layer
Disables tracer-computed stats

Motivation

Some tracers support trace stats computation, but not all of them do, and some have it disabled by default. Computing trace stats in the Serverless Compatibility Layer gives every environment supported by this package access to trace stats.

https://datadoghq.atlassian.net/browse/SVLS-8032

Additional Notes

Uses stats_concentrator_service with some refactors from bottlecap.
Disabled by default, can be enabled with DD_SERVERLESS_STATS_COMPUTATION_ENABLED=true
- Once agent computed stats is enabled by default, we can disable tracer computed stats with client_drop_p0s: false in agent /info endpoint until we sufficiently test tracer computed stats in Serverless environments
Agent computed stats sent with client_computed: false to stats intake
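
The opt-in toggle described above could be read along these lines. This is a minimal sketch, not the PR's actual code: `parse_enabled_flag` and `flag_enabled` are hypothetical names, and treating `1` as equivalent to `true` is an assumption.

```rust
use std::env;

/// Hypothetical helper: interprets an optional raw value as a boolean flag.
/// Unset or unrecognized values fall back to `false` (disabled by default).
/// Accepting "1" in addition to "true" is an assumption of this sketch.
fn parse_enabled_flag(value: Option<&str>) -> bool {
    match value {
        Some(v) => matches!(v.trim().to_ascii_lowercase().as_str(), "true" | "1"),
        None => false,
    }
}

/// Reads the named environment variable and applies the parsing above.
fn flag_enabled(name: &str) -> bool {
    parse_enabled_flag(env::var(name).ok().as_deref())
}
```

Calling `flag_enabled("DD_SERVERLESS_STATS_COMPUTATION_ENABLED")` at startup would then decide whether the stats concentrator is created at all.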

If the stats flusher stops, the entire mini agent shuts down, so spans can no longer be added to the unbounded channel in the stats concentrator service

serverless-components/crates/datadog-trace-agent/src/mini_agent.rs

Lines 298 to 300 in 9894ef7

    
    result = &mut stats_flusher_handle => {
        return Err(format!("Stats flusher task terminated unexpectedly: {result:?}").into());
    },

Updates libdatadog revision to DataDog/libdatadog@971c407
- feat(libdd-trace-utils): check for empty value in header datadog-client-computed-stats libdatadog#1900
- feat: add PartialEq and Eq traits to TracerMetadata libdatadog#1922
Span derived primary tags is on hold pending some updates to the spec
- feat(stats)!: implement span derived primary tags in libdd-trace-stats libdatadog#1815

Describe how to test/QA your changes

Unit tests for stats flusher to flush if tracer computed stats or serverless computed stats are available
Integration tests to compute stats in the serverless compatibility layer by default and to skip computation if the Datadog-Client-Computed-Stats header is present on the trace request
- Corresponding configuration in Go Agent: https://github.com/DataDog/datadog-agent/blob/867176f1b6b6f87f3aadb7a248d79c5fe9cd0bac/pkg/trace/agent/agent.go#L522-L524
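
The skip decision tested above can be sketched as a small predicate. This is only one plausible reading of the header check: `client_computed_stats` is a hypothetical name, and the rule that a missing or empty value means "not computed by the tracer" is an assumption based on the title of libdatadog#1900. How explicit values such as `false` are handled is not covered here.

```rust
/// Hypothetical helper: returns true when the tracer signals, through the
/// `Datadog-Client-Computed-Stats` header value, that it already computed
/// stats. Assumption: a missing or empty (whitespace-only) value counts as
/// "not computed", so the agent should compute stats itself.
fn client_computed_stats(header_value: Option<&str>) -> bool {
    header_value.map_or(false, |v| !v.trim().is_empty())
}

/// Agent-side computation runs only when it is enabled and the tracer has
/// not already computed stats for this request.
fn should_compute_stats(enabled: bool, header_value: Option<&str>) -> bool {
    enabled && !client_computed_stats(header_value)
}
```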

duncanpharvey · 2026-04-01T20:48:28Z

@codex review

Copilot

Pull request overview

This PR adds trace stats computation capabilities to the Serverless Compatibility Layer. It enables the system to compute and send trace statistics when tracers haven't already computed them, enhancing stats availability across all supported serverless environments.

Changes:

Adds a new StatsConcentratorService module that manages span concentration and stats bucket generation using the libdd-trace-stats library
Adds a new StatsGenerator module that extracts metadata and spans from traces and sends them to the stats concentrator
Extends ServerlessTraceProcessor to optionally call the stats generator when the tracer hasn't already computed stats
Extends ServerlessStatsFlusher to flush both tracer-computed and serverless-computed stats
Adds integration tests to verify stats computation and to verify stats are skipped when client has already computed them
Adds configuration options for service, env, and version fields, and an environment variable to enable/disable stats computation

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated no comments.

File	Description
`src/stats_concentrator_service.rs`	New module implementing stats concentration service with metadata tracking and flush logic
`src/stats_generator.rs`	New module for extracting trace information and sending to stats concentrator
`src/trace_processor.rs`	Modified to conditionally invoke stats generator based on client-computed-stats header
`src/stats_flusher.rs`	Modified to flush both channel and concentrator-based stats with proper flow control
`src/config.rs`	Added service, env, version fields and corresponding environment variable reading
`src/main.rs`	Added initialization of stats computation with environment variable toggle
`src/lib.rs`	Exported new modules
`src/mini_agent.rs`	Minor code style improvements (let-else pattern)
`Cargo.toml`	Added `thiserror` and `libdd-trace-stats` dependencies
`tests/integration_test.rs`	Added comprehensive integration tests for stats generation scenarios
`tests/common/mock_server.rs`	Updated to handle stats intake with proper 202 status code response
`tests/common/helpers.rs`	Extended send_tcp_request to support additional headers

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ac8d049051

Lewis-E · 2026-04-02T20:21:44Z

-                self.flush_stats(config.clone(), buffer.to_vec()).await;
-                buffer.clear();
+            // copy the batch for this flush
+            let channel_stats = buffer.to_vec();


Nit: LLM recommends let channel_stats = std::mem::take(&mut *buffer); to replace let channel_stats = buffer.to_vec(); buffer.clear(); but that's unrelated to the actual intent of the change here, so.

Applied this change!
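
For readers comparing the two approaches in this thread, here is a minimal sketch. The function names and the `Vec<String>` element type are illustrative only. Both functions hand back the same batch and leave the buffer empty; `std::mem::take` moves the elements out instead of cloning them.

```rust
/// Drains the buffer by cloning its contents and then clearing it.
/// The buffer keeps its allocated capacity, but every element is cloned.
fn drain_by_copy(buffer: &mut Vec<String>) -> Vec<String> {
    let batch = buffer.to_vec();
    buffer.clear();
    batch
}

/// Drains the buffer by moving its contents out, leaving `Vec::default()`
/// (an empty Vec) in its place. No elements are cloned.
fn drain_by_take(buffer: &mut Vec<String>) -> Vec<String> {
    std::mem::take(buffer)
}
```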

Lewis-E · 2026-04-02T20:26:09Z

+                }
+                Ok(None) => {}
+                Err(e) => {
+                    error!("Failed to flush stats concentrator: {e}");


Do we want a log in a loop here, or an error that gets caught so the whole process is restarted or shut down? The trace flusher and stats flusher both return an Err("<this code> task terminated unexpectedly".into()); it would be consistent to have that for the concentrator as well.

I added a task terminated unexpectedly check for the span concentrator service to be consistent with these other background tasks.

apiarian-datadog · 2026-04-24T20:14:57Z

+    pub fn new(config: Arc<Config>) -> (Self, StatsConcentratorHandle) {
+        let (tx, rx) = mpsc::unbounded_channel();
+        let handle = StatsConcentratorHandle::new(tx);
+        // TODO: set span_kinds_stats_computed and peer_tag_keys


when will we do this todo?

I plan to work on it this week!

apiarian-datadog · 2026-04-24T20:20:16Z

+                // Drain client stats in buffer and stats from concentrator on interval
+                _ = interval.tick() => {
+                    let client_stats = std::mem::take(&mut buffer);
+                    if !client_stats.is_empty() || self.stats_concentrator.is_some() {


why do we check for the stats concentrator to be some here?

The stats flusher is either flushing trace stats received from the tracer in client_stats or trace stats computed by the agent in stats_concentrator. We want to flush if either one of these cases is true. I added more comments to clarify this.
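
The condition explained in this reply can be isolated into a small predicate. This is a sketch under stated assumptions: `Concentrator` is a hypothetical stand-in for the agent-side concentrator handle, and `should_flush` is not a function from the PR.

```rust
/// Hypothetical stand-in for the agent-side stats concentrator handle.
struct Concentrator;

/// Returns true when a flush should run on this tick: either the tracer sent
/// stats (non-empty buffer) or agent-side computation is configured, in which
/// case the concentrator may hold buckets even when the buffer is empty.
fn should_flush<T>(client_stats: &[T], concentrator: Option<&Concentrator>) -> bool {
    !client_stats.is_empty() || concentrator.is_some()
}
```

With no concentrator and an empty buffer the tick is a no-op, which avoids sending empty stats payloads.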

apiarian-datadog · 2026-04-24T20:22:00Z

+                    tracer_language: tracer_payload.language_name.clone(),
+                    tracer_version: tracer_payload.tracer_version.clone(),
+                    hostname: String::new(),
+                    service_name: None,


the payload doesn't have a service name to pull?

Correct, the TracerPayload does not have a service name to pull.

While the TracerPayload struct does not have a service name to pull, we can pull the service name from _dd.base_service on the root span.

See #51 (comment)
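
A rough sketch of pulling the service name from `_dd.base_service` on the root span follows. The `Span` struct is a simplified, hypothetical type, and `root_span` is an illustrative stand-in for a root lookup, not the library's implementation.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified span; the real span type has more fields.
struct Span {
    span_id: u64,
    parent_id: u64,
    meta: HashMap<String, String>,
}

/// Simplified root lookup: the first span whose parent is not in the trace.
/// Illustrative only; the real lookup may handle more edge cases.
fn root_span(trace: &[Span]) -> Option<&Span> {
    let ids: HashSet<u64> = trace.iter().map(|s| s.span_id).collect();
    trace.iter().find(|s| !ids.contains(&s.parent_id))
}

/// Returns the `_dd.base_service` tag from the root span, if present and
/// non-empty. Returns None for an empty trace or a missing tag.
fn base_service(trace: &[Span]) -> Option<String> {
    root_span(trace)?
        .meta
        .get("_dd.base_service")
        .filter(|s| !s.is_empty())
        .cloned()
}
```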

apiarian-datadog · 2026-04-24T20:23:35Z

+                }
+            }
+        } else {
+            error!("Unsupported tracer payload version. Failed to send trace stats.");


we should probably say which version we saw that we don't know how to handle. i saw a proposal for a new tracer payload version coming down the pike

Added the unsupported tracer payload version to the error log.

# What does this PR do?
Derives PartialEq and Eq on TracerMetadata, enabling equality comparisons between instances using `==` and `!=`.
# Motivation
Adding these standard library traits enables equality assertions without requiring callers to implement their own field-by-field comparison. Needed for DataDog/serverless-components#51 (comment)
# Additional Notes
# How to test the change?
Unit tests
Co-authored-by: duncan.harvey <duncan.harvey@datadoghq.com>

duncanpharvey force-pushed the duncan-harvey/trace-stats branch 2 times, most recently from a7773cc to 070ae12 Compare December 11, 2025 14:51

duncanpharvey force-pushed the duncan-harvey/trace-stats branch from 2056c19 to d5a3f08 Compare December 17, 2025 22:16

duncanpharvey force-pushed the duncan-harvey/trace-stats branch 2 times, most recently from 006cb8e to 4142915 Compare March 27, 2026 14:16

duncanpharvey mentioned this pull request Mar 27, 2026

feat(stats)!: implement span derived primary tags in libdd-trace-stats DataDog/libdatadog#1815

Open

duncanpharvey force-pushed the duncan-harvey/trace-stats branch from 5a64507 to 3562c93 Compare March 31, 2026 17:50

duncanpharvey marked this pull request as ready for review April 1, 2026 20:48

duncanpharvey requested a review from a team as a code owner April 1, 2026 20:48

Copilot AI review requested due to automatic review settings April 1, 2026 20:48

duncanpharvey requested a review from a team as a code owner April 1, 2026 20:48

duncanpharvey requested review from Lewis-E and litianningdatadog and removed request for a team April 1, 2026 20:48

Copilot started reviewing on behalf of duncanpharvey April 1, 2026 20:48

Copilot AI reviewed Apr 1, 2026

chatgpt-codex-connector Bot reviewed Apr 1, 2026

Comment thread crates/datadog-trace-agent/src/stats_concentrator_service.rs

duncanpharvey requested a review from apiarian-datadog April 2, 2026 15:12

apiarian-datadog reviewed Apr 2, 2026

Lewis-E reviewed Apr 2, 2026

duncanpharvey force-pushed the duncan-harvey/trace-stats branch from 9e4544f to c2191f8 Compare April 7, 2026 15:50

duncanpharvey marked this pull request as draft April 9, 2026 20:51

duncanpharvey requested a review from lucaspimentel April 9, 2026 20:51

duncanpharvey force-pushed the duncan-harvey/trace-stats branch 3 times, most recently from df2f714 to 3b74133 Compare April 20, 2026 17:14

duncanpharvey marked this pull request as ready for review April 20, 2026 17:21

duncanpharvey mentioned this pull request Apr 21, 2026

feat(libdd-trace-utils): check for empty value in header datadog-client-computed-stats DataDog/libdatadog#1900

Merged

duncanpharvey added 6 commits April 24, 2026 11:18

drain in flight handlers prior to shutdown

85c33e2

refactor StatsGenerator into ServerlessTraceProcessor

1bd18c0

remove should_flush_stats_buffer from stats_flusher

5132865

only return errors rather than logging one error and returning another

4bd780f

wait for the mini agent to finish shutting down in integration tests before asserting the results

0755452

add StatsConcentratorComponents struct

e46f752

duncanpharvey force-pushed the duncan-harvey/trace-stats branch from 3b74133 to e46f752 Compare April 24, 2026 18:23

duncanpharvey added 3 commits April 24, 2026 14:25

update licenses

8e83906

rename DD_SERVERLESS_STATS_COMPUTATION_ENABLED to DD_AGENT_STATS_COMPUTATION_ENABLED

16044f1

check for multiple tracers in stats concentrator service

8b2e49f

duncanpharvey mentioned this pull request Apr 24, 2026

feat: add PartialEq and Eq traits to TracerMetadata DataDog/libdatadog#1922

Merged

apiarian-datadog self-requested a review April 24, 2026 20:03

apiarian-datadog approved these changes Apr 24, 2026

duncanpharvey added 3 commits April 27, 2026 10:32

add comment for client_drop_p0s

497cacd

add comments to clarify trace stats flushing

f6ed172

log version of unsupported tracer payload for trace stats computation

b7209e5

update libdatadog rev to 971c407d856db58baf1078bd7802abe13bac4f9f

92759ec

lucaspimentel reviewed Apr 27, 2026

Comment thread crates/datadog-trace-agent/src/config.rs Outdated

lucaspimentel reviewed Apr 27, 2026

Comment thread crates/datadog-trace-agent/src/config.rs Outdated

duncanpharvey mentioned this pull request Apr 28, 2026

feat: add Hash trait to TracerMetadata DataDog/libdatadog#1931

Open

duncanpharvey added 4 commits April 28, 2026 10:59

use service from spans, set default env, use multiple stats concentrators

0b0d5c2

fix env handling

66c4b0b

add comments

023674f

only use base service for stats concentrator

9894ef7

duncanpharvey requested a review from lucaspimentel April 28, 2026 15:39

lucaspimentel reviewed Apr 28, 2026

Comment thread crates/datadog-trace-agent/src/trace_processor.rs Outdated

duncanpharvey added 2 commits April 28, 2026 16:02

disable agent computed stats by default

bb29ed2

use trace_utils::get_root_span_index to get root span for trace stats computation

a4e0f84

lucaspimentel approved these changes Apr 28, 2026


5 participants
